Signature File Hashing Using Term Occurrence and Query Frequencies
Authors
Aktug, Can
Abstract
Signature files act as a filter on retrieval, discarding a large number of non-qualifying data items. Linear hashing with superimposed signatures (LHSS) provides an effective retrieval filter for processing queries in dynamic databases. This study analyzes the effects of reflecting term occurrence and query frequencies in LHSS signatures. The approach relaxes the unrealistic uniform frequency assumption and lets terms with high discriminatory power set more bits in signatures. Simulation experiments based on the derived formulas explore the page savings obtained with different occurrence and query frequency combinations at different hashing levels. The results show that the performance of LHSS improves with the hashing level, and that the larger the difference between the discriminatory power values of the terms, the higher the retrieval efficiency. The paper also discusses how this approach alleviates the imbalance between the levels of efficiency and relevancy that arises under the unrealistic uniform frequency assumption.
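The idea summarized in the abstract, superimposed signatures in which terms with higher discriminatory power set more bits, can be illustrated with a minimal sketch. This is not the authors' implementation: the signature length `SIG_BITS`, the function names, the SHA-256 based bit selection, and the per-term bit counts in `bits_per_term` are all assumptions chosen for illustration, and the linear hashing of signatures into pages is not modeled.

```python
import hashlib

SIG_BITS = 64  # hypothetical signature length; not taken from the paper


def term_bits(term, n_bits, sig_bits=SIG_BITS):
    """Return n_bits distinct, deterministic bit positions for a term."""
    positions = set()
    counter = 0
    while len(positions) < n_bits:
        digest = hashlib.sha256(f"{term}:{counter}".encode()).digest()
        positions.add(int.from_bytes(digest[:4], "big") % sig_bits)
        counter += 1
    return positions


def signature(terms, bits_per_term, default_bits=4):
    """Superimpose (bitwise OR) the bit patterns of all terms.

    bits_per_term maps a term to the number of bits it sets; in this
    sketch, terms assumed to be more discriminatory get a larger count.
    """
    sig = 0
    for term in terms:
        for pos in term_bits(term, bits_per_term.get(term, default_bits)):
            sig |= 1 << pos
    return sig


def may_match(record_sig, query_sig):
    """Filter test: every bit set in the query must be set in the record."""
    return record_sig & query_sig == query_sig


# Hypothetical weights: a rare, frequently queried term sets more bits.
bits_per_term = {"hashing": 8, "the": 1}
record_sig = signature(["the", "hashing", "signature"], bits_per_term)
query_sig = signature(["hashing"], bits_per_term)
print(may_match(record_sig, query_sig))
```

Because a term sets the same positions in record and query signatures, a record that contains every query term always passes the filter. Records that pass without containing the terms (false drops) still have to be eliminated by checking the actual data items.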
Similar resources
Dynamic Signature File Partitioning Based on Term Characteristics
Signature files act as a filter on retrieval to discard a large number of non-qualifying data items. Linear hashing with superimposed signatures (LHSS) provides an effective retrieval filter to process queries in dynamic databases. This study is an analysis of the effects of reflecting the term query and occurrence characteristics to signatures in LHSS. This approach relaxes the unrealistic uni...
Analysis of Signature Generation Schemes for Multiterm Queries In Partitioned Signature File Environments
Our analysis explores the performance of three superimposed signature generation schemes as they are applied to a dynamic signature file organization based on linear hashing: Linear Hashing with Superimposed Signatures (LHSS). The first scheme (SM) allows all terms to set the same number of bits, whereas the second and third methods (MMS and MMM) emphasize the terms with high discriminatory power. In...
Design of a Signature File Method that Accounts for Non-Uniform Occurrence and Query Frequencies
In this paper we study a variation of the signature file access method for text and attribute retrieval. According to this method, the documents (or records) are stored sequentially in the “text file”. Abstractions (“signatures”) of the documents (or records) are stored in the “signature file”. The latter serves as a filter on retrieval: It helps discarding a large number of non-qualifying documen...
Vertical Framing of Superimposed Signature Files Using Partial Evaluation of Queries
A new signature file method, Multi-Frame Signature File (MFSF), is introduced by extending the bit-sliced signature file method. In MFSF, a signature file is divided into variable sized vertical frames with different on-bit densities to optimize the response time using a partial query evaluation methodology. In query evaluation the on-bits of the lower on-bit density frames are used first. As th...
Partial Evaluation of Queries for Bit-Sliced Signature Files
Our research extends the bit-sliced signature organization by introducing a partial evaluation approach for queries. The partial evaluation approach minimizes the response time by using a subset of the on-bits of the query signature. A new signature file optimization method, Partially evaluated Bit-Sliced Signature File (P-BSSF), for multi-term query environments using the partial evaluation ap...